Abstract



We study the relationship between the competitive ratio and the tail distribution of 
randomized online minimization problems. To this end, we define a broad class of online 
problems that includes some of the well-studied problems like paging, k-server and metrical
task systems on finite metrics, and show that for these problems it is possible to obtain, given
an algorithm with constant expected competitive ratio, another algorithm that achieves the
same solution quality up to an arbitrarily small constant error $\varepsilon$ with high probability; the
"high probability" statement is in terms of the optimal cost. Furthermore, we show that our 
assumptions are tight in the sense that removing any of them allows for a counterexample to 
the theorem. In addition, there are examples of other problems not covered by our definition, 
where similar high probability results can be obtained. 

1 Introduction 

In online computation, we face the challenge of designing algorithms that work in environments 
where parts of the input are not known while parts of the output (that may heavily depend 
on the yet unknown input pieces) are already needed. The standard way of evaluating the 
quality of online algorithms is by means of competitive analysis, where one compares the outcome 
of an online algorithm to the optimal solution constructed by a hypothetical optimal offline 
algorithm. Since deterministic strategies are often proven to fail for the most prominent problems, 
randomization is used as a powerful tool to construct high-quality algorithms that outperform 
their deterministic counterparts. These algorithms base their computations on the outcome 
of a random source; for a detailed introduction to online problems we refer the reader to the 
literature 

The most common way to measure the performance of randomized algorithms is to analyze the 
worst-case expected outcome and to compare it to the optimal solution. With offline algorithms, a 
statement about the expected outcome is also a statement about the outcome with high probability 
due to Markov's inequality and the fact that the algorithm may be executed many times to 
amplify the probability of success [o]. However, this amplification is not possible in online settings. 
As online algorithms only have one attempt to compute a reasonably good result, a statement 
with respect to the expected value of their competitive ratio may be rather unsatisfying. As a
matter of fact, for a fixed input, it might be the case that such an algorithm produces results of 
a very high quality in very few cases (i. e., for a rather small number of random choices), but 
is unacceptably bad for the majority of random computations; still, the expected competitive 
ratio might suggest a better performance. Thus, if we want to have a certain guarantee that 
some randomized online algorithm obtains a particular quality, we must have a closer look at its 
analysis. In such a setting, we would like to state that the algorithm does not only perform well 
on average, but "almost always." 

Besides a theoretical formalization of the above statement, the main contribution of this paper 
is to show that, for a broad class of problems, the existence of a randomized online algorithm 
that performs well in expectation immediately implies the existence of a randomized online 
algorithm that is virtually as good with high probability. Our investigations, however, need 
to be detailed in order to face the particularities of the framework. First, we show that it is 
not possible to measure the probability of success with respect to the input size, which might 
be considered the straightforward approach. Many of the known randomized online algorithms 
are naturally divided into some kind of phases (e.g., the algorithm for metrical task systems 
from Borodin et al. js], the marking algorithm for paging from Fiat et al. etc.) where each 
phase is processed and analyzed separately. Since the phases are independent, a high probability 
result (i.e., with a probability converging to 1 with an increasing number of phases) can be 
obtained. However, the definition of these phases is specific to each problem and algorithm. Also, 
there are other algorithms (e.g., the optimal paging algorithm from Achlioptas et al. [2] and 
many workfunction-based algorithms) that use other constructions and that are not divided 
into phases. As we want to establish results with high probability that are independent of the 
concrete algorithms, we thus have to measure this probability with respect to another parameter; 
we show that the cost of an optimal solution is a very reasonable quantity for this purpose. 

Then again it turns out that, if we consider general online problems, the notions of the 
expected outcome and an outcome with high probability are still not related in any way, i. e., we 
define problems for which these two measures are incomparable. Hence, we carefully examine 
both to which parameter the probability should relate and which properties we need the studied 
problem to fulfill to again allow a division into independent phases; finally, this allows us to 
construct randomized online algorithms that perform well with a probability tending to 1 with a 
growing size of the optimal cost. We show that this technique is applicable for a wide range of 
online problems. 

Classically, results concerning randomized online algorithms commonly analyze their expected 
behavior; there are, however, a few exceptions, e. g., Leonardi et al. [14] analyze the tail distribution
of algorithms for call control problems, and Maggs et al. [15] deal with online distributed data
management strategies that minimize the congestion in certain network topologies. 



Overview of this Paper 

In Section 2 we define the class of symmetric online minimization problems and present the main
result (Theorem 1). The theorem states that, for any symmetric problem which fulfills certain
natural conditions, it is possible to transform an algorithm with constant expected competitive
ratio $r$ to an algorithm having a competitive ratio of $(1 + \varepsilon)r$ with high probability (with respect
to the cost of an optimal solution). Section 3 is devoted to proving Theorem 1. We partition
the run of the algorithm into phases such that the loss incurred by the phase changes can be
amortized; however, to control the variance within one phase, we need to further subdivide
the phases. Modelling the cost of single phases as dependent random variables, we obtain a
supermartingale that enables us to apply the Azuma-Hoeffding inequality and thus to obtain
the result. These investigations are followed by applications of the theorem in Section 4 where
we show that our result is applicable for task systems and that for the k-server problem on
unbounded metric spaces, no comparable result can be obtained. We further elaborate on the
tightness of our result in Section 5.



2 Preliminaries 

We use the following definitions of online algorithms [4] that deal with online minimization
problems. 

Definition 1 (Online Algorithm). Consider an initial configuration $I$ and an input sequence
$x = (x_1, \dots, x_n)$. An online algorithm A computes the output sequence $A(I, x) = (y_1, \dots, y_n)$,
where $y_i = f(I, x_1, \dots, x_i)$ for some function $f$. The cost of the solution $A(I, x)$ is denoted by
$\mathrm{Cost}_{I,x}(A)$.

For the ease of presentation, we refer to the tuple that consists of the initial configuration and 
the input sequence, i. e., $(I, x)$, as the input of the problem. Even though the initial configuration
is not explicitly introduced in the definition in [4], it is often very natural, and it is used in the
definitions of some well-known online problems (e. g., the k-server problem [13]). As we see later,
the notion of an initial configuration plays an important role in the relationship between different 
variants of the competitive ratio. 

Since, for the majority of online problems, deterministic strategies are often doomed to fail in
terms of their output quality, randomization is used in the design of online algorithms [4, 9, 11].
Formally, randomized online algorithms can be defined as follows.

Definition 2 (Randomized Online Algorithm). A randomized online algorithm R computes
the output sequence $R^{\phi}(I, x) = (y_1, \dots, y_n)$ such that $y_i$ is computed from $\phi, I, x_1, \dots, x_i$, where
$\phi$ is the content of the random tape, i. e., an infinite binary sequence where every bit is chosen
uniformly at random and independent of the others. By $\mathrm{Cost}_{I,x}(R)$ we denote the random variable
(over the probability space defined by $\phi$) expressing the cost of the solution $R^{\phi}(I, x)$.

The efficiency of an online algorithm is usually measured in terms of the competitive ratio as 
introduced by Sleator and Tarjan 



Definition 3 (Competitive Ratio). An online algorithm A is $r$-competitive, for some $r \ge 1$, if
there exists a constant $\alpha$ such that, for every initial configuration $I$ and each input sequence $x$,
$\mathrm{Cost}_{I,x}(A) \le r \cdot \mathrm{Cost}_{I,x}(\mathrm{OPT}) + \alpha$, where $\mathrm{Cost}_{I,x}(\mathrm{OPT})$ denotes the value of the optimal solution
for the given instance; an online algorithm is optimal if it is 1-competitive with $\alpha = 0$.

When dealing with randomized online algorithms we compare the expected outcome to the 
one of an optimal algorithm. 

Definition 4 (Expected Competitive Ratio). A randomized online algorithm R is $r$-competitive$^1$
in expectation if there exists a constant $\alpha$ such that, for every initial configuration $I$
and input sequence $x$, $\mathbb{E}[\mathrm{Cost}_{I,x}(R)] \le r \cdot \mathrm{Cost}_{I,x}(\mathrm{OPT}) + \alpha$.

In the sequel, we analyze the notion of competitive ratio with high probability. Before stating 
the definition, however, we quickly discuss what parameter the high probability should relate 
to. As already mentioned, a natural way would be to define an event to have high probability 
if the probability that it appears tends to 1 with increasing input length (i.e., the number 
of requests). However, this does not seem to be very useful; consider, e.g., the well-known 
paging problem [4, 11] with cache size k (we describe and study paging more thoroughly in
Subsection 4.3): For any input x of length n and any competitive ratio r and any d, there is an



$^1$The notion of competitiveness for randomized online algorithms as used in this paper is called competitiveness
against an oblivious adversary in the literature. For an overview of the different adversary models, see, e.g., [4].
input $x'$ of length $dn$ formed by repeating every request $d$ times. Hence, for any algorithm, the
performance on $x$ and $x'$ is the same.$^2$ This implies that there is no randomized algorithm for
paging that achieves a competitive ratio of less than $k$ with a probability approaching 1 with
growing $n$. Let $r < k$ and suppose that there exist $n_0 \in \mathbb{N}$ and a randomized online algorithm R
that, for any input $x$ with $|x| = n \ge n_0$, is $r$-competitive with probability $1 - 1/f(n)$, for some
function $f$ that tends to infinity with growing $n$. Thus, there is a randomized online algorithm
R' that is $r$-competitive on every instance $x'$ independent of its length with this probability. In
particular, if there exists such an algorithm, then there exists a randomized online algorithm
C that is $r$-competitive on instances of length $k$ with probability $1 - 1/f(n)$, for any $n$. Now
consider the following instance that consists of $k$ requests and let the cache be initialized with
pages $1, \dots, k$; an adversary requests page $k + 1$ at the beginning and a unique page in the next
$k - 1$ time steps. Clearly, there exists an optimal solution with cost 1. In every time step in
which a page fault occurs, C, using its random source, chooses a page to evict to make space in
the cache. Since the adversary knows C's probability distribution, without loss of generality, we
assume that C chooses every page with the same probability. Note that there exists a sequence
$p_1, \dots, p_k$ of "bad" choices that causes C to have cost $k$. In the first time step, C chooses the bad
page with probability at least $1/k$; with probability at least $1/k^2$, it chooses the bad pages in
the first and the second time step and so on. Clearly, the probability that it chooses the bad
sequence is at least $1/k^k$. But this immediately contradicts that C performs well on this instance
with probability $1 - 1/f(n)$, for arbitrarily large $n$.
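
The counting argument above can be checked exactly for small caches. The following is a minimal sketch, assuming the adversary's instance is the fixed sequence k + 1, 1, 2, ..., k - 1 and that the algorithm evicts a uniformly random page on each fault; the concrete request order and the helper name `cost_distribution` are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction


def cost_distribution(k):
    """Exact cost distribution of uniform-random eviction on the assumed
    instance k+1, 1, 2, ..., k-1 with initial cache {1, ..., k}."""
    requests = [k + 1] + list(range(1, k))
    dist = {}

    def run(cache, pos, cost, prob):
        # All requests served: record the accumulated probability mass.
        if pos == len(requests):
            dist[cost] = dist.get(cost, Fraction(0)) + prob
            return
        page = requests[pos]
        if page in cache:
            run(cache, pos + 1, cost, prob)      # hit, no cost
            return
        for victim in cache:                     # fault: each victim equally likely
            next_cache = (cache - {victim}) | {page}
            run(next_cache, pos + 1, cost + 1, prob / len(cache))

    run(frozenset(range(1, k + 1)), 0, 0, Fraction(1))
    return dist


dist = cost_distribution(3)
print(dist[3], Fraction(1, 3 ** 3))  # probability of cost k versus the bound 1/k^k
```

For this instance the probability of paying the full cost k stays above $1/k^k$ regardless of how long the (padded) input is, which is the obstacle to defining "high probability" in terms of the input length.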

Then again, for the practical use of paging algorithms, the instances where also the optimal 
algorithm makes faults are of interest. Hence, it seems reasonable to define the term high 
probability with respect to the cost of an optimal solution. In this paper, we use a strong notion 
of high probability requiring the error probability to be subpolynomial. 

Definition 5 (Competitive Ratio w.h.p.). A randomized online algorithm R is $r$-competitive
with high probability (w.h.p. for short) if, for any $\beta \ge 1$, there exists a constant $\alpha$ such that for
all initial configurations and inputs $(I, x)$ it holds that

\[ \Pr[\mathrm{Cost}_{I,x}(R) > r \cdot \mathrm{Cost}_{I,x}(\mathrm{OPT}) + \alpha] \le (2 + \mathrm{Cost}_{I,x}(\mathrm{OPT}))^{-\beta}. \]

First, note that the purpose of the constant 2 on the right-hand side of the formula is to
properly handle inputs with a small (possibly zero) optimum. The choice of the particular
constant is somewhat arbitrary (however, it should be greater than 1) since the $\alpha$ term on the
left-hand side hides the effects. We now show that the two notions of the expected and the
high-probability competitiveness are incomparable. Let $[n]$ denote the set $\{1, \dots, n\}$.

1. On the one hand, there are problems for which the competitive ratio w.h.p. is better
than the expected one. Consider, e.g., the following problem. There is a unique initial
configuration $I$ and the input sequence consists of $n + 1$ bits $x_0, \dots, x_n$. An
online algorithm has to produce one-bit answers $y_0, \dots, y_n$. If, for every $i \in [n]$, it
holds that $y_{i-1} = x_i$, the cost is $D = 2^{2n}$, otherwise the cost is $\sum_{i=0}^{n} x_i$, which is optimal.
A straightforward algorithm that guesses each bit with probability 1/2 has probability
$1 - 1/2^n$ to be optimal on every input.

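Consider some $\beta \ge 1$; let $n_\beta$ be the smallest integer such that $2^{n_\beta} \ge (n_\beta + 3)^\beta$ and let
$\alpha = 2^{2 n_\beta}$. For any input of length $n \ge n_\beta$ we have

\[ \Pr[\mathrm{Cost}_{I,x}(R) > \mathrm{Cost}_{I,x}(\mathrm{OPT})] \le \frac{1}{2^n} \le \frac{1}{(n + 3)^\beta} \le \frac{1}{(2 + \mathrm{Cost}_{I,x}(\mathrm{OPT}))^\beta}. \]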



$^2$This is true if we assume that the algorithm only deletes a page from its buffer if a page fault occurs, which is
implied by the problem definition, see [11].
For inputs of length at most $n_\beta$, any solution has a cost of at most $\alpha$, so

\[ \Pr[\mathrm{Cost}_{I,x}(R) > \alpha] = 0. \]

Hence, the algorithm is 1-competitive w.h.p.

However, for any algorithm, there is an input such that the probability of guessing the
whole sequence is at least $1/2^n$, so the expected cost is at least $D/2^n$. Since the optimum
is at most $n + 1$, any algorithm has an expected competitive ratio of at least

\[ \frac{D}{(n + 1) 2^n} = \frac{2^n}{n + 1}. \]

2. On the other hand, the following problem shows that sometimes the expected performance
is better than the one we get w.h.p. In fact, we show that the gap between these two
measures can be arbitrarily large. Consider a problem with $n$ requests, where the first $n - 1$
ones are just dummy requests that serve for padding, and the last one is $x_n \in \{1, \dots, b\}$
for some positive integer $b$ that depends on $n$. The answer $y_1$ has to be a number from the
set $\{1, \dots, b\}$. The cost is $n$ if $y_1 \ne x_n$, and $bn$ otherwise. An algorithm that chooses
$y_1$ uniformly at random pays $n$ with probability $(b - 1)/b$ and $bn$ with probability $1/b$;
hence the expected cost is $n(1 + (b - 1)/b) < 2n$. However, there is always an input such
that the probability to pay $bn$ is at least $1/b$. For any $k$, we can choose $b := n^k$. Then,
no algorithm can achieve a solution with cost better than $n^{k+1}$ with probability at least
$1 - 1/n^k$. Since the optimal cost is $n$, there is no algorithm with competitive ratio $n^k$
w.h.p., but there is one with an expected competitive ratio of 2.
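
The two examples can be made tangible with exact arithmetic. The sketch below assumes the uniform guessing strategies described in the two items; the function names `bit_guessing` and `padded_guess` are hypothetical and only serve this illustration.

```python
from fractions import Fraction


def bit_guessing(n):
    """Example 1: uniform guessing pays D = 2^(2n) only if all n guesses
    match the input, which happens with probability 1/2^n."""
    fail = Fraction(1, 2 ** n)
    D = 2 ** (2 * n)
    opt_upper = n + 1                       # upper bound on the optimal cost
    expected_lower = fail * D               # expected cost is at least D / 2^n
    return fail, expected_lower / opt_upper


def padded_guess(n, k):
    """Example 2 with b = n^k: cost n unless the guess equals x_n, then b * n."""
    b = n ** k
    expected = n * Fraction(b - 1, b) + b * n * Fraction(1, b)
    fail = Fraction(1, b)                   # probability of paying b * n
    return expected / n, fail               # optimal cost is n


print(bit_guessing(10))     # small failure probability, large expected ratio
print(padded_guess(10, 2))  # expected ratio below 2, failure probability 1/n^k
```

In the first example the failure probability shrinks exponentially while the lower bound on the expected ratio grows as $2^n/(n+1)$; in the second the expected ratio stays below 2 while the failure probability is only polynomially small.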

However, the problems used in the previous examples were quite artificial; many real-world 
online problems share additional properties that guarantee a closer relationship between the 
expected and high-probability behavior. In what follows, we thus focus on so-called partitionable 
problems. 

Definition 6 (Partitionability). An online problem is called partitionable if there is a nonnegative
function $\mathcal{V}$ such that, for any initial configuration $I$, the sequence of requests $x_1, \dots, x_n$,
and the corresponding solutions $y_1, \dots, y_n$, we have

\[ \mathrm{Cost}_{I,x}(y_1, \dots, y_n) = \sum_{i=1}^{n} \mathcal{V}(I, x_1, \dots, x_i; y_1, \dots, y_i). \]

In other words, for a partitionable problem, the cost of a solution is the sum of the costs of 
particular answers, and the cost of each answer is independent of the future input and output. 
The partitionability allows us to speak of the cost of a subsequence of the outputs. A problem 
can only fail to be partitionable if the cost may decrease with additional request-answer pairs.
We can, however, transform every online problem into a partitionable one by introducing a 
dummy request at the end as a unique end marker. This way, we can assign a value of zero to 
all answers but the last one. Therefore, the partitionability condition stated in this way causes 
no restriction on the online problem. However, we further restrict the behavior, and it will be 
convenient to think in terms of the "cost of a particular answer." 
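
As an illustration of the "cost of a particular answer", consider paging, where each answer costs 1 on a page fault and 0 otherwise. The following sketch assumes that an answer is the page evicted on a fault (or `None` on a hit); the names `step_cost` and `total_cost` are hypothetical.

```python
def step_cost(cache, request):
    """Cost of a single answer for paging: 1 on a page fault, 0 on a hit."""
    return 0 if request in cache else 1


def total_cost(initial_cache, requests, evictions):
    """Sum of the per-answer costs; evictions[i] is the page removed when
    serving the i-th request (None if the request is a hit)."""
    cache = set(initial_cache)
    cost = 0
    for request, victim in zip(requests, evictions):
        cost += step_cost(cache, request)   # depends only on the history so far
        if request not in cache:
            cache.remove(victim)
            cache.add(request)
    return cost


print(total_cost({1, 2, 3}, [4, 1, 2], [1, 2, 3]))
print(total_cost({1, 2, 3}, [4, 1, 2], [3, None, None]))
```

Each summand depends only on the initial cache and the request-answer pairs seen so far, which is the property the definition asks for.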

Definition 7 (Request-Boundedness). Let the function $\mathcal{V}$ be defined as in Definition 6. A
partitionable problem is called request-bounded if, for some constant $F$, we have

\[ \forall I, x, y, i\colon\ \mathcal{V}(I, x_1, \dots, x_i; y_1, \dots, y_i) \le F \ \text{ or } \ \mathcal{V}(I, x_1, \dots, x_i; y_1, \dots, y_i) = \infty. \]

Note that for any partitionable problem there is a natural notion of a state; for instance,
it is the content of the memory for the paging problem, the position of the servers for the
k-server problem, etc. Now we provide a general definition of this notion. By $a \cdot b$, we denote the
concatenation of two sequences $a$ and $b$; $\lambda$ denotes the empty sequence.

Definition 8 (State). Consider two initial configurations $I$ and $I'$, two sequences of requests
$x = (x_1, \dots, x_n)$ and $x' = (x'_1, \dots, x'_m)$, and two sequences of outputs $y = (y_1, \dots, y_n)$ and
$y' = (y'_1, \dots, y'_m)$. The triples $(I, x, y)$ and $(I', x', y')$ are equivalent if, for any sequence of
requests $x'' = (x''_1, \dots, x''_p)$ and a sequence of outputs $y'' = (y''_1, \dots, y''_p)$, the input $(I, x \cdot x'')$ is
valid with a solution $y \cdot y''$ if and only if the input $(I', x' \cdot x'')$ is valid with a solution $y' \cdot y''$, and
the cost of $y''$ is the same for both solutions.

A state $s$ of the problem is an equivalence class over the triples $(I, x, y)$. Let $(I, x, y)$ be some
triple in $s$. By $\mathrm{OPT}_s(x')$ we denote the sequence of outputs $y'$ such that $y \cdot y'$ is a valid solution of
the input $(I, x \cdot x')$ and the cost $\mathrm{Cost}_{J,x'}(\mathrm{OPT}_s(x'))$ of $y'$ is minimal, where $J$ is the configuration
determined by $(I, x, y)$. A state $s$ is an initial state if and only if it contains some triple $(I, \lambda, \lambda)$.

We chose this definition of states as it covers best the properties of online computations as we 
need them in our main theorem. An alternative definition could use task systems with infinitely 
many states, but the description would become less intuitive; we will return to task systems in 
Section 4.

From now on we sometimes slightly abuse notation and write $\mathrm{Cost}(\mathrm{OPT}_s(x'))$ instead of
$\mathrm{Cost}_{J,x'}(\mathrm{OPT}_s(x'))$ if the configuration $J$ corresponds to a triple in $s$, as it is sufficient to know
the state $s$ instead of $J$ in order to determine the value of the function. Intuitively, a state from
Definition 8 encapsulates all information about the ongoing computation of the algorithm that
is relevant for evaluating the efficiency of the future processing. Usually, the state is naturally 
described in the problem-specific domain (content of cache, current position of servers, set of 
jobs accepted so far, etc.). Note, however, that the internal state of an algorithm is a different 
notion since it may, e.g., behave differently if the starting request had some particular value. 
The following properties are crucial for our approach to probability amplification. 

Definition 9 (Opt-Boundedness). A partitionable online problem is called opt-bounded if
there exists a constant $B$ such that $\forall s, s', x\colon |\mathrm{Cost}(\mathrm{OPT}_s(x)) - \mathrm{Cost}(\mathrm{OPT}_{s'}(x))| \le B$.

Note that the definition of opt-boundedness implies that any request sequence x is valid. In 
particular, the request sequence may end at any time. 

Definition 10 (Symmetric Problem). An online problem is called symmetric if it is partitionable
and every state is initial.

Formally, any partitionable problem may be transformed into a symmetric one simply by 
redefining the set of initial states. However, this transformation may significantly change the 
properties of the problem. Now we are going to state the main result of this paper, namely that, 
under certain conditions, the expected competitive ratio of symmetric problems can be achieved 
w.h.p. 

Theorem 1. Consider an opt-bounded symmetric online problem for which there is a randomized
online algorithm A with constant expected competitive ratio $r$. Then, for any constant $\varepsilon > 0$,
there is a randomized online algorithm A' with competitive ratio $(1 + \varepsilon)r$ w.h.p. (with respect to
the optimal cost).

We prove this theorem in the subsequent section. 
3 Proof of Theorem 1



For ease of presentation, we first provide a proof for a restricted setting where the online problem
at hand is also request-bounded. 

The algorithm A' simulates A and, on some specific places, performs a reset operation: if a
part $x'$ of the input has been read so far, and a corresponding output $y'$ has been produced,
$(I, x', y')$ belongs to the same state as $(I', \lambda, \lambda)$, for some initial configuration $I'$, because we are
dealing with a symmetric problem; hence, A can be restarted by A' from $I'$.

The general idea to boost the probability of acquiring a low cost is to perform a reset each 
time the algorithm incurs too much cost and to use Markov's inequality to bound the probability 
of such an event. However, the exact value of how much is "too much" depends on the optimal 
cost of the input which is not known in advance. Therefore, the input is first partitioned into 
phases of a fixed optimal cost, and then each phase is cut into subphases based on the cost 
incurred so far. A reset may cause an additional expected cost of $r \cdot B$ for the subsequent phase
compared to an optimal strategy starting from another state, where $B$ is the constant of the
opt-boundedness (Definition 9), i.e., $B$ bounds the different costs between two optimal solutions
for a fixed input for different states. We therefore have to ensure that the phases are long enough 
so as to amortize this overhead. 

From now on let us consider $\varepsilon$, $r$, $B$, and $F$ to be fixed constants; recall that $F$ originates from
the request-boundedness property of the online problem at hand (Definition 7). The algorithm A'
is parameterized by two parameters $C$ and $D$ that depend on $\varepsilon$, $r$, $B$, and $F$. These parameters
control the length of the phases and subphases, respectively, such that $C + F$ delimits the optimal
cost of one phase and $D + F$ delimits the cost of the solution computed by A' on one subphase;
we require that $D > r(C + F + B)$.

Consider an input sequence $x = (x_1, \dots, x_n)$, an initial configuration $I$, and let the optimal
cost of the input $(I, x)$ be between $(k - 1)C$ and $kC$ for some integer $k$. Then $x$ can be partitioned
into $k$ phases $\tilde{x}_1 = (x_1, \dots, x_{n_2 - 1})$, $\tilde{x}_2 = (x_{n_2}, \dots, x_{n_3 - 1})$, ..., $\tilde{x}_k = (x_{n_k}, \dots, x_n)$ in such a way
that $n_i$ is the minimal index for which the optimal cost of the input $(I, (x_1, \dots, x_{n_i}))$ is at least
$(i - 1)C$. It follows that the optimal cost for one phase is at least $C - F$ and at most $C + F$, with
the exception of the last phase which may be cheaper. Note that this partition can be generated 
by the online algorithm itself, i. e., A' can determine when a next phase starts. There are only 
two reasons for A' to perform a reset: at the beginning of each phase and after incurring a cost 
exceeding D since the last reset. Hence, A' starts each phase with a reset, and the processing of 
each phase is partitioned into a number of subphases each of cost at least D (with the exception 
of the possibly cheaper last subphase) and at most D + F. 
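
The phase boundaries described above can be computed from the optimal costs of the input prefixes. The following is a minimal sketch, assuming these prefix costs are available as a nondecreasing list (the text only states that A' can determine them online); the function name `phase_starts` is hypothetical.

```python
def phase_starts(prefix_opt, C):
    """Return the 1-based indices n_1, n_2, ... where n_i is the minimal index
    whose prefix has optimal cost at least (i - 1) * C.

    prefix_opt[j - 1] is assumed to be the optimal cost of (x_1, ..., x_j)."""
    starts = []
    for index, cost in enumerate(prefix_opt, start=1):
        # The next phase to open is number len(starts) + 1, with
        # threshold len(starts) * C.
        while cost >= len(starts) * C:
            starts.append(index)
    return starts


# Optimal cost of the whole input is 5, i.e. between (k - 1) * C and k * C for k = 3.
print(phase_starts([0, 1, 1, 2, 3, 3, 4, 5], C=2))
```

In the sketch a new phase opens as soon as the prefix optimum crosses the next multiple of $C$; the algorithm A' would perform a reset at each of these indices and, additionally, whenever the cost since the last reset exceeds $D$.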

Now we are going to discuss the cost of A' on a particular input. Let us fix the input $(I, x)$
which subsequently also fixes the indices $1 = n_1, n_2, \dots, n_k$. Let $S_i$ be a random variable denoting
the state of the problem (according to Definition 8) just before processing request $x_i$, and let
$W(i, j)$, $i \le j$, be a random variable denoting the cost of A' incurred on the input $x_i, \dots, x_j$. The
following claim is obvious.

Claim 1. If A' performs a reset just before processing $x_i$, then $S_i$ captures all the information
from the past $W(i, j)$ depends on. In particular, if we fix $S_i = s$, $W(i, j)$ does not depend on
$W(l_1, l_2)$, for any $l_1 \le l_2 < i$ and any state $s$.

The overall structure of the proof is as follows. We first show in Lemma 2 that the expected
cost incurred during a phase (conditioned by the state in which the phase was entered) is at most
$\mu := r(C + F + B)/(1 - \rho)$, where $\rho := r(C + F + B)/D < 1$. We can then consider variables
$Z_0, Z_1, \dots, Z_k$ such that

\[ Z_0 := k\mu, \qquad Z_i := (k - i)\mu + \sum_{j=1}^{i} \overline{W}_j \quad \text{for } i > 0, \]

where $\overline{W}_i$ is the cost of the $i$th phase, clipped from above by some logarithmic bound, i. e.,

\[ \overline{W}_i := \min\{W(n_i, n_{i+1} - 1),\ c \log k\}, \]

for some suitable constant $c$. We show in Lemma 3 that $Z_0, Z_1, \dots, Z_k$ form a bounded
supermartingale, and then use the Azuma-Hoeffding inequality to conclude that $Z_k$ is unlikely to be
much larger than $Z_0$. By a suitable choice of the free parameters, this implies that $Z_k$ is unlikely
to be much larger than the expected cost of A. Finally, we show that w.h.p. $Z_k$ is the cost of the
algorithm A'.
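
To see how the sequence of Z-variables is built from the phase costs, the following sketch computes it for given numbers. It assumes the natural logarithm for the clipping threshold (the text does not fix the base) and uses the hypothetical name `z_sequence`; the sample costs are invented for illustration.

```python
import math


def z_sequence(phase_costs, mu, c):
    """Return [Z_0, ..., Z_k] for the per-phase costs W(n_i, n_{i+1} - 1)."""
    k = len(phase_costs)
    cap = c * math.log(k)                   # clipping threshold c log k (assumed natural log)
    z = [k * mu]                            # Z_0 = k * mu
    clipped_sum = 0.0
    for i, w in enumerate(phase_costs, start=1):
        clipped_sum += min(w, cap)          # clipped cost of phase i
        z.append((k - i) * mu + clipped_sum)
    return z


print(z_sequence([3.0, 5.0, 100.0, 4.0], mu=6, c=10))
```

Each step replaces one copy of $\mu$ by the clipped cost of the corresponding phase, so the last entry equals the total cost of A' whenever no phase exceeds the threshold.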

In order to argue about the expected cost of a given phase in Lemma 2, let us first show that
a phase is unlikely to have many subphases. For the rest of the proof, let $X_j$ be the random
variable denoting the number of subphases of phase $j$.

Lemma 1. For any $i$, $s$, and any $\delta \in \mathbb{N}$ it holds that $\Pr[X_i \ge \delta \mid S_{n_i} = s] \le \rho^{\delta - 1}$.

Proof. The proof is done by induction on $\delta$. For $\delta = 1$ the statement holds by definition. Let $\bar{n}_c$
denote the index of the first request after $c - 1$ subphases, with $\bar{n}_1 = n_i$, and $\bar{n}_c = \infty$ if there are
less than $c$ subphases. In order to have at least $\delta \ge 2$ subphases, the algorithm must enter some
suffix of phase $i$ at position $\bar{n}_{\delta-1}$ and incur a cost of more than $D$ (see Fig. 1). Hence,

\begin{align*}
\Pr[X_i \ge \delta \mid S_{n_i} = s] = {} & \Pr[\bar{n}_{\delta-1} \le n_{i+1} - 1 \mid S_{n_i} = s] \tag{1} \\
& \cdot \Pr[W(\bar{n}_{\delta-1}, n_{i+1} - 1) > D \mid \bar{n}_{\delta-1} \le n_{i+1} - 1 \wedge S_{n_i} = s].
\end{align*}

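Figure 1: The situation with $\delta$ subphases.

The fact that $\bar{n}_{\delta-1} \le n_{i+1} - 1$ means that there are at least $\delta - 1$ subphases, i.e.,

\[ \Pr[\bar{n}_{\delta-1} \le n_{i+1} - 1 \mid S_{n_i} = s] = \Pr[X_i \ge \delta - 1 \mid S_{n_i} = s] \le \rho^{\delta-2} \tag{2} \]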

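by the induction hypothesis. Further, we can decompose

\begin{align*}
& \Pr[W(\bar{n}_{\delta-1}, n_{i+1} - 1) > D \mid \bar{n}_{\delta-1} \le n_{i+1} - 1 \wedge S_{n_i} = s] \tag{3} \\
& \quad = \sum_{s',\ n_i \le i' \le n_{i+1} - 1} \Pr[W(\bar{n}_{\delta-1}, n_{i+1} - 1) > D \mid \bar{n}_{\delta-1} = i' \wedge S_{i'} = s' \wedge S_{n_i} = s] \\
& \qquad \cdot \Pr[\bar{n}_{\delta-1} = i' \wedge S_{i'} = s' \mid \bar{n}_{\delta-1} \le n_{i+1} - 1 \wedge S_{n_i} = s].
\end{align*}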

Now let us argue about the probability

\[ \Pr[W(\bar{n}_{\delta-1}, n_{i+1} - 1) > D \mid \bar{n}_{\delta-1} = i' \wedge S_{i'} = s' \wedge S_{n_i} = s]. \]

The algorithm A' performed a reset just before reading $x_{i'}$, so it starts simulating A from state $s'$.
However, in the optimal solution, there is some state $s''$ associated with position $i'$ such that the
cost of the remainder of the $i$th phase is at most $C + F$. Due to the assumption of the theorem,
the optimal cost on the input $x_{i'}, \dots, x_{n_{i+1} - 1}$ starting from state $s'$ is at most $C + F + B$, and
the expected cost incurred by A is at most $r(C + F + B)$. Using Markov's inequality, we get

\[ \Pr[W(\bar{n}_{\delta-1}, n_{i+1} - 1) > D \mid \bar{n}_{\delta-1} = i' \wedge S_{i'} = s'] \le \frac{r(C + F + B)}{D} = \rho. \tag{4} \]

Plugging (4) into (3), and then together with (2) into (1) yields the result.

□

Now we can argue about the expected cost of a phase. 
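Lemma 2. For any $i$ and $s$ it holds that $\mathbb{E}[W(n_i, n_{i+1} - 1) \mid S_{n_i} = s] \le \mu$.

Proof. Let $\bar{n}_c$ be defined as in the proof of Lemma 1. Using the same arguments, we have that
the expected cost of a single subphase is

\[ \mathbb{E}[W(\bar{n}_c, \min\{\bar{n}_{c+1}, n_{i+1} - 1\}) \mid \bar{n}_c = i' \wedge S_{i'} = s'] \le r(C + F + B). \]

Conditioning and decomposing by $\bar{n}_c$ and $s'$, we get that

\[ \mathbb{E}[W(\bar{n}_c, \min\{\bar{n}_{c+1}, n_{i+1} - 1\}) \mid X_i \ge c] \le r(C + F + B). \]

Finally, let $Q_{i,c} = W(\bar{n}_c, \min\{\bar{n}_{c+1}, n_{i+1} - 1\})$ if $X_i \ge c$, or $Q_{i,c} = 0$ if $X_i < c$. This gives

\begin{align*}
\mathbb{E}[W(n_i, n_{i+1} - 1) \mid S_{n_i} = s] & = \sum_{c=1}^{\infty} \mathbb{E}[Q_{i,c} \mid S_{n_i} = s] \\
& = \sum_{c=1}^{\infty} \mathbb{E}[Q_{i,c} \mid S_{n_i} = s \wedge X_i \ge c] \cdot \Pr[X_i \ge c \mid S_{n_i} = s] \\
& \le \sum_{c=1}^{\infty} r(C + F + B) \rho^{c-1} = r(C + F + B)/(1 - \rho).
\end{align*}

□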

Once the expected cost of a phase is established, we can construct the supermartingale as 
follows. 

Lemma 3. For any constant $c > 0$, the sequence $Z_0, Z_1, \dots, Z_k$ is a supermartingale.

Proof. Consider a fixed $c$. We have to show that for each $i$, $\mathbb{E}[Z_{i+1} \mid Z_0, \dots, Z_i] \le Z_i$. From the
definition of the $Z_i$'s it follows that $Z_{i+1} - Z_i = \overline{W}_{i+1} - \mu$. Consider any elementary event $\xi$
from the probability space, and let $Z_i(\xi) = z_i$, for $i = 0, \dots, k$, be the values of the corresponding
random variables. We have

\begin{align*}
\mathbb{E}[Z_{i+1} \mid Z_0, \dots, Z_i](\xi) & = \mathbb{E}[Z_{i+1} \mid Z_0 = z_0, \dots, Z_i = z_i] \\
& = \mathbb{E}[Z_i + \overline{W}_{i+1} - \mu \mid Z_0 = z_0, \dots, Z_i = z_i] \\
& = z_i - \mu + \mathbb{E}[\overline{W}_{i+1} \mid Z_0 = z_0, \dots, Z_i = z_i] \\
& = z_i - \mu + \sum_{s} \mathbb{E}[\overline{W}_{i+1} \mid Z_0 = z_0, \dots, Z_i = z_i, S_{n_{i+1}} = s] \\
& \qquad \cdot \Pr[S_{n_{i+1}} = s \mid Z_0 = z_0, \dots, Z_i = z_i] \\
& \le z_i - \mu + \sum_{s} \mathbb{E}[W(n_{i+1}, n_{i+2} - 1) \mid S_{n_{i+1}} = s] \cdot \Pr[S_{n_{i+1}} = s \mid Z_0 = z_0, \dots, Z_i = z_i] \\
& \le z_i - \mu + \mu \sum_{s} \Pr[S_{n_{i+1}} = s \mid Z_0 = z_0, \dots, Z_i = z_i] = z_i = Z_i(\xi),
\end{align*}

where the last inequality is a consequence of Lemma 2.

□






Now we can use the following special case of the Azuma-Hoeffding inequality [Tlls]. 



Lemma 4 (Azuma, Hoeffding). Let $Z_0, Z_1, \dots$ be a supermartingale, such that $|Z_{i+1} - Z_i| \le \gamma$.
Then for any positive real $t$,

\[ \Pr[Z_k - Z_0 \ge t] \le \exp\left(-\frac{t^2}{2k\gamma^2}\right). \]
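
As a sanity check of the bound in its standard form $\exp(-t^2/(2k\gamma^2))$, the following sketch compares it with the exact tail of a fair random walk with steps of plus or minus 1, which is a martingale with $\gamma = 1$. The random walk is an illustrative assumption and not the process used in the proof; both function names are hypothetical.

```python
import math


def azuma_bound(t, k, gamma):
    """Upper bound exp(-t^2 / (2 k gamma^2)) on Pr[Z_k - Z_0 >= t]."""
    return math.exp(-t * t / (2.0 * k * gamma * gamma))


def walk_tail(k, t):
    """Exact Pr[S_k >= t] for a fair +/-1 random walk S_k after k steps."""
    # S_k = 2 * heads - k, so S_k >= t iff heads >= (k + t) / 2.
    heads_needed = math.ceil((k + t) / 2)
    favourable = sum(math.comb(k, j) for j in range(heads_needed, k + 1))
    return favourable / 2 ** k


print(walk_tail(100, 20), azuma_bound(20, 100, 1))
```

The exact tail lies below the bound, as the inequality predicts; in the proof the same bound is applied with $\gamma = c \log k$.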



In order to apply Lemma 4, we need the following bound.

Claim 2. Let $k$ be such that $c \log k > \mu$. For any $i$ it holds that $|Z_{i+1} - Z_i| < c \log k$.

We are now ready to prove the subsequent lemma. 

Lemma 5. Let $k$ be such that $c \log k > \mu$. There is a constant $C$ (depending on $F$, $B$, $\varepsilon$, $r$)
such that

\[ \Pr[Z_k \ge (1 + \varepsilon) r k C] \le \exp\left(-\frac{k((1 + \varepsilon) r C - \mu)^2}{2 c^2 \log^2 k}\right). \]


Proof. Applying Lemma 4 for any positive $t$, we get

\[ \Pr[Z_k - Z_0 \ge t] \le \exp\left(-\frac{t^2}{2 k c^2 \log^2 k}\right). \]

Noting that $Z_0 = k\mu$, and choosing

\[ t := k((1 + \varepsilon) r C - \mu) \]

the statement follows. The only remaining task is to verify that $t > 0$, i.e., that there is a
constant $D$ such that

\[ (1 + \varepsilon) r C > \frac{r(C + F + B)}{1 - \frac{r(C + F + B)}{D}}. \]

Let us choose $C$ such that $C > (F + B)/\varepsilon$. Then $(1 + \varepsilon)C > C + F + B$, and it is possible to choose
$D$ such that both $D > r(C + B + F)$ as required, and

\[ D > \frac{(1 + \varepsilon) r^2 C (C + B + F)}{r((1 + \varepsilon)C - (C + B + F))}. \]

Thus, we have

\[ rD(1 + \varepsilon)C - rD(C + B + F) > (1 + \varepsilon) r^2 C (C + B + F) \]

and therefore

\[ (1 + \varepsilon) r C (D - r(C + B + F)) > rD(C + B + F) \]

and the claim follows.

□ 
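
The choice of $C$ and $D$ in the proof above can be traced numerically. The sketch below picks both constants just above the stated thresholds (the margin of 1 is an arbitrary assumption) and returns the resulting $\rho$ and $\mu$; the function name `choose_parameters` is hypothetical.

```python
def choose_parameters(eps, r, F, B):
    """Pick C and D as in the proof of Lemma 5 and return (C, D, rho, mu)."""
    C = (F + B) / eps + 1.0                        # ensures C > (F + B) / eps
    base = C + B + F
    gap = (1.0 + eps) * C - base                   # positive by the choice of C
    # D must exceed both r(C + B + F) and the second threshold from the proof.
    d_min = max(r * base, (1.0 + eps) * r * C * base / gap)
    D = d_min + 1.0
    rho = r * base / D
    mu = r * base / (1.0 - rho)
    return C, D, rho, mu


C, D, rho, mu = choose_parameters(eps=0.5, r=2.0, F=1.0, B=1.0)
print(C, D, rho, mu, (1.0 + 0.5) * 2.0 * C)  # the last value should exceed mu
```

With these values $t = k((1 + \varepsilon) r C - \mu)$ is positive, although possibly by a small margin; a larger $D$ widens the gap at the price of longer subphases.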

To get to the statement of the main theorem, we show the following technical bound. 
Lemma 6. For any $c$, and $\beta \ge 1$ there is a $k_0$ such that for any $k \ge k_0$

\[ \exp\left(-\frac{k((1 + \varepsilon) r C - \mu)^2}{2 c^2 \log^2 k}\right) \le \frac{1}{2(2 + kC)^\beta}. \]

Proof. Note that the left-hand side is of the form $\exp(-\eta k / \log^2 k)$ for some positive constant $\eta$.
Clearly, for any $\beta \ge 1$ and large enough $k$, it holds that $\exp(\eta k / \log^2 k) \ge 2(2 + kC)^\beta$.

□

Combining Lemmata 5 and 6 we get the following result.



Corollary 1. There is a constant $C$ (depending on $F$, $B$, $\varepsilon$, $r$) such that for any $\beta > 1$ there is
a $k_0$ such that for any $k > k_0$ we have

$$\Pr[Z_k \ge (1+\varepsilon) r k C] \le \frac{1}{2(2+kC)^\beta}.$$
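To get a feeling for how large the threshold has to be, one can compare the two sides of the inequality of Lemma 6 numerically. The sketch below is an illustration under assumptions: the names `azuma_side`, `target_side`, and `last_violation` are hypothetical, `eta` stands for the positive constant from the proof of Lemma 6, and a finite scan up to `k_max` is of course not a proof that the inequality holds for all larger k.

```python
import math


def azuma_side(k, eta):
    """Exponential side exp(-eta * k / log^2 k) of the inequality."""
    return math.exp(-eta * k / math.log(k) ** 2)


def target_side(k, C, beta):
    """Polynomial side 1 / (2 (2 + k C)^beta) of the inequality."""
    return 1.0 / (2.0 * (2.0 + k * C) ** beta)


def last_violation(eta, C, beta, k_max=100_000):
    """Largest k in [2, k_max] at which the exponential side still exceeds
    the polynomial side (returns 1 if this never happens in that range)."""
    last = 1
    for k in range(2, k_max + 1):
        if azuma_side(k, eta) > target_side(k, C, beta):
            last = k
    return last
```

Calling `last_violation(1.0, 1.0, 2.0)` gives a candidate for the threshold for these sample values; from that point on (within the scanned range) the exponential bound stays below the polynomial one.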



In order to finish the proof of the main theorem we show that w.h.p., $Z_k$ is actually the cost
of the algorithm $A'$.

Lemma 7. For any $\beta > 1$ there is a $c$ and a $k_1$ such that for any $k > k_1$

$$\Pr[Z_k \ne \mathrm{Cost}(A')] \le \frac{1}{2(2+kC)^\beta}.$$



Proof. Since $Z_k = \sum_j \min\{W(n_j, n_{j+1} - 1), c \log k\}$, the event that $Z_k \ne \mathrm{Cost}(A')$ happens
exactly when there exists some $j$ such that $W(n_j, n_{j+1} - 1) > c \log k$.

Consider any fixed $j$. Since the cost of a subphase is at most $D + F$, it holds that
$W(n_j, n_{j+1} - 1) \le X_j (F + D)$. From Lemma 1 it follows that for any $c$,



Pr[VK(nj, n^+i — 1) > clog A;] < Pr 
Consider the function 



c log k 
F + D 



c log k T 



$$g(k) := \frac{\log\left(2k(2+kC)^\beta\right)}{\log k}.$$

It is decreasing, and $\lim_{k \to \infty} g(k) = 1 + \beta$. Hence, it is possible to find a constant $c$, and a $k_1$
such that for any $k > k_1$ it holds that



From that it follows that 



log(Mclog/c /i\ 

> logl -] + log(2k{2 + kC) 



F + D 



and 



log 



clog k 
F + D 



1^ > \og(2k{2 + kCf^ 



I.e., 



1\ 'p+o 



> 2k{2 + kC) 
Thus, for this choice of $c$ and $k_1$, it holds that



c log k -| 

Fr[W{nj,nj+i - 1) > clog A;] < p < 



Using the union bound, we conclude that the probability that the cost of any phase exceeds
$c \log k$ is at most $1/(2(2+kC)^\beta)$.

□ 

Using the union bound, combining Lemma 7 and Corollary 1, and noting that the cost of the
optimum is at most $kC$, we get the following statement.

Corollary 2. There is a constant $C$ such that for any $\beta > 1$ there is a $k_2$ such that for any $k > k_2$
it holds that

$$\Pr[\mathrm{Cost}(A') \ge (1+\varepsilon) r\, \mathrm{Cost}(\mathrm{OPT})] \le \frac{1}{(2+kC)^\beta}.$$

To conclude the proof by showing that for any $\beta > 1$ there is some $\alpha$ such that

$$\Pr[\mathrm{Cost}(A') \ge (1+\varepsilon) r\, \mathrm{Cost}(\mathrm{OPT}) + \alpha] \le \frac{1}{(2+kC)^\beta}$$

holds for all $k$, we have to choose $\alpha$ large enough to cover the cases of $k < k_2$. For these cases,
$\mathrm{Cost}(\mathrm{OPT}) < k_2 C$, and hence the expected cost of $A$ is at most $r k_2 C$, and due to Lemma 2 the
expected cost of $A'$ is constant. The right-hand side $(2+kC)^{-\beta}$ is decreasing in $k$, so it is at least
$(2+k_2 C)^{-\beta}$, which is again a constant. From Markov's inequality it follows that there exists a
constant $\alpha$ such that

$$\Pr[\mathrm{Cost}(A') \ge \alpha] \le \frac{1}{(2+k_2 C)^\beta},$$

finishing the proof of the restricted setting.



3.1 Avoiding Request-Boundedness 

All that is left to do is to show how to handle problems that are not request-bounded. The main 
idea is to apply the restricted Theorem 1 to a modified request-bounded version of the given
problem. We then have to show that there is a modified version of the algorithm such that the 
computed solution has an expected competitive ratio close to the original one for the modified 
problem. By ensuring that any solution to the modified problem translates to a solution of the 
original problem with at most the same competitive ratio, it is enough to apply our theorem to 
the modified problem to obtain an analogous result for the original problem. 

Let P be an opt-bounded symmetric problem; then P is described entirely by the feasible 
request-answer pairs (depending on the states), by its set of states S, and by costs of all request- 
answer pairs for all states. Note that an expected r-competitive online algorithm A for P has to 
have an expected competitive ratio of r for every request-answer pair.

Let $\mathrm{Cost}_P(s, x, y)$ denote the cost to give $y$ as answer on request $x$ when in state $s$ of the
problem $P$. Let $\mathcal{Y}$ be the set of all possible answers. Then we define the $(\sigma, \tau)$-truncated version
$P'$ of $P$ as follows. Let $s$ be a state and $x$ be a request; we set

$$m(s, x) := \min_{y \in \mathcal{Y}} \{\mathrm{Cost}_P(s, x, y)\},$$

i.e., the minimal cost to answer $x$ when in state $s$. In $P'$ we assign the cost $\mathrm{Cost}_{P'}(s, x, y) =
\mathrm{Cost}_P(s, x, y)$ if $m(s, x) \le \sigma$, and $\mathrm{Cost}_{P'}(s, x, y) = \mathrm{Cost}_P(s, x, y) - m(s, x) + \sigma$ otherwise. We
define all request-answer pairs of $P'$ such that $\mathrm{Cost}_{P'}(s, x, y) > \tau$ to have a cost of $\infty$. Both $P$
and $P'$ have the same remaining feasible request-answer pairs for each state.
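The definition of the truncated problem can be written down directly as a cost transformation. The following Python sketch is an illustration under assumptions: `cost_p` is a hypothetical callable giving the cost in P, `answers` is a finite list standing in for the set of possible answers, and `sigma` and `tau` name the two truncation parameters.

```python
import math


def truncated_cost(cost_p, state, request, answer, answers, sigma, tau):
    """Cost of `answer` in the truncated problem P'.

    cost_p(state, request, answer) is the cost in the original problem P.
    """
    # m(s, x): cheapest possible answer to this request in this state
    m = min(cost_p(state, request, y) for y in answers)
    cost = cost_p(state, request, answer)
    if m > sigma:
        # expensive requests are shifted down so that the cheapest answer costs sigma
        cost = cost - m + sigma
    # answers that are still more expensive than tau become infeasible
    return math.inf if cost > tau else cost
```

With a toy cost function such as `cost_p = lambda s, x, y: x + y`, a request whose cheapest answer exceeds `sigma` is shifted so that its cheapest answer costs exactly `sigma`, while cost differences between answers to the same request are preserved as long as they stay below `tau`.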

Note that any algorithm that gives an answer of cost oo with nonzero probability cannot be 
competitive and that due to the modifications of the cost function, some distinct states of P 
may become a single state of P' . We will abuse notation and ignore this fact because it does not 
change the proof. Thus we assume that both problems have the same set of states. 

We continue with some insights that help us to choose useful values for $\sigma$ and $\tau$.

Claim 3. Given an expected $r$-competitive algorithm $A$ for $P$, for any $\delta > 0$ there is a $(r + \delta)$-competitive
online algorithm $C$ for $P$ such that the cost $\mathrm{Cost}_P(s, x, y)$ for any $y$ provided by $C$ is
at most $\delta^{-1}(\alpha + r \cdot (m(s, x) + B))$. Furthermore, if $m(s, x) > \frac{r}{r-1} B$, $C$ may ignore the destination
state and give a minimum cost answer greedily.

Proof. Let $s'$ be the state selected after $s$ by an optimal solution and let $s''$ be the state when
giving a greedy answer of cost $m(s, x)$. Let $\mathrm{opt}_s$, $\mathrm{opt}_{s'}$, and $\mathrm{opt}_{s''}$ be the costs of the respective
optimal solutions when starting from $s$, $s'$, or $s''$.

We first note that the optimal answer that leads from s to s' can have a cost of at most 
m(s, x) + B as otherwise, by the opt-boundedness, choosing greedily and moving to s" would be 
a better solution. 

The sum of probabilities of $A$ to select an answer of cost at least $\kappa$ is at most $(\alpha + r \cdot (m(s, x) +
B))/\kappa$, where the parameter $\alpha$ is due to the definition of the competitive ratio. Otherwise the
expected value would be too high if the adversary chooses to only send a single request. We set
$\kappa = \delta^{-1}(\alpha + r \cdot (m(s, x) + B))$ to satisfy the $\delta$-closeness to the expected competitiveness. We
now show how to handle large values of $m(s, x)$. To be $r$-competitive, we can afford a cost of

$$r \cdot \mathrm{opt}_s \ge r \cdot (m(s, x) + \mathrm{opt}_{s'}) \ge r \cdot (m(s, x) + \mathrm{opt}_{s''} - B).$$

If we choose the first answer greedily and apply $A$ for all remaining requests, the expected cost of
the solution is at most

$$m(s, x) + r \cdot \mathrm{opt}_{s''} \le m(s, x) + r \cdot (\mathrm{opt}_{s''} - B) + r \cdot B.$$

Therefore, if $m(s, x) > \frac{r}{r-1} B$, the modified solution is $r$-competitive.

□ 
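The threshold on m(s, x) in the second part of the claim follows from comparing the greedy cost with the affordable budget. The Python sketch below illustrates this comparison; the function names and the sample numbers are assumptions of this example, `opt_s2` stands for the optimal cost when starting from the greedy destination state, and r > 1 is assumed.

```python
def greedy_is_affordable(m, r, B, opt_s2):
    """Compare the greedy cost bound with the budget from the proof.

    budget: r * (m + opt_s2 - B), a lower bound on r times the optimum
    greedy: m + r * opt_s2, the expected cost of answering greedily
            and then running the r-competitive algorithm
    """
    budget = r * (m + opt_s2 - B)
    greedy = m + r * opt_s2
    return greedy <= budget


def greedy_threshold(r, B):
    """Value of m(s, x) from which the comparison above succeeds (r > 1)."""
    return r * B / (r - 1)
```

Rearranging `greedy <= budget` gives m + rB <= rm, i.e., m >= rB/(r - 1), independently of `opt_s2`; for r = 2 and B = 3 the threshold is 6.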

The claim suggests to set $\sigma = \frac{r}{r-1} B$ and $\tau = 2\varepsilon^{-1}(\alpha + r \cdot (\frac{r}{r-1} B + B))$, where we chose
$\delta = \varepsilon/2$. From now on $P'$ is the $(\sigma, \tau)$-truncated version of $P$ with these values of $\sigma$ and $\tau$.

As before, let $A$ be an online algorithm for $P$ that computes a solution with expected
competitive ratio at most $r$. We design an algorithm $A''$ for $P'$ as follows. Suppose in state $s$ of
$P'$, the adversary requests $x$. Then $A''$ simulates $A$ in state $s$ on $x$ within $P$. If $m(s, x) \le \sigma$ and
the answer $y$ has a cost smaller than $\tau$, the answer of $A''$ is $y$. Otherwise $A''$ ignores the answer of
$A$ and answers greedily while ignoring the destination state, and performing a reset subsequently.

It is clear that all answers of $A''$ are feasible for $P'$. We first show that the expected competitive
ratio of $A''$ for $P'$ is at most $r + \varepsilon/2$. For each round with $m(s, x) \le \sigma$, the claim follows directly
from Claim 3 using that any answer in $P$ with cost higher than $\tau$ neither affects an optimal
answer nor the algorithm's answer due to the claim. Otherwise, if $m(s, x) > \sigma$, the competitive
ratio of the greedy answer is at most $r$, using the same argumentation as in the proof of the
second part of Claim 3.

To summarize, $P'$ is a symmetric, opt-bounded, and request-bounded problem and $A''$ is
an expected $(r + \varepsilon/2)$-competitive algorithm for $P'$. Therefore, we can apply the restricted
Theorem 1 as proven in the last section with an error of $\varepsilon/2$ and with $A''$ to show that there is
an algorithm $A'$ that is $(r + \varepsilon)$-competitive for $P'$ w.h.p.
Finally we show that the competitive ratio in $P$ for any sequence of answers on any request
string cannot be larger than the competitive ratio of the same sequence in $P'$.

Observe that a string of answers is optimal for $P$ if and only if it is optimal for $P'$. Due to
the opt-boundedness, an optimal solution cannot have any answer on request $x$ from state $s$ that
has a cost larger than $m(s, x) + B$ in $P$ or larger than $\sigma + B$ in $P'$. Therefore the parameter $\tau$
does not influence any optimal solution in $P'$ and it cannot be an advantage to give an answer in
$P$ that is set to a cost of $\infty$ in $P'$. In each time step, the difference of the cost of any answer $y$
in $P$ and $P'$ given any state $s$ and request $x$ is fixed to exactly $m(s, x) - \sigma$ as long as the answer
has finite cost. Thus, any improvement of the answer sequence in one of the problems translates
to an improvement in the other one.

Let $z = z_1, z_2, \dots, z_k$ be an optimal sequence of answers and $s'_1, s'_2, \dots, s'_k$ be the corresponding
sequence of states. Then it is sufficient to show that for each $i$, the competitive ratio of $A'$ for $P$ is
at most as high as the competitive ratio for $P'$. For any $i$, let us fix a state $s$ and a request $x$. Let
$y$ be the answer given by $A'$. Then the competitive ratio in $P'$ is $\mathrm{Cost}_{P'}(s, x, y)/\mathrm{Cost}_{P'}(s'_i, x, z_i)$.
If $m(s, x) \le \sigma$, the cost of both the optimal answer and the algorithmic answer, and therefore
also the ratio, is identical in $P$ and $P'$. Otherwise, the ratio in $P$ is

$$\frac{\mathrm{Cost}_P(s, x, y)}{\mathrm{Cost}_P(s'_i, x, z_i)}
= \frac{\mathrm{Cost}_{P'}(s, x, y) + m(s, x) - \sigma}{\mathrm{Cost}_{P'}(s'_i, x, z_i) + m(s, x) - \sigma}
\le \frac{\mathrm{Cost}_{P'}(s, x, y)}{\mathrm{Cost}_{P'}(s'_i, x, z_i)},$$

where the last inequality uses that any competitive ratio is at least one.

4 Applications 

We now discuss the impact of Theorem 1 on task systems, the $k$-server problem, and paging.
Despite being related, these problems have different flavors when analyzing them in the context 
of high probability results. Finally, we show that there are also problems that do not directly fit 
into our framework but nevertheless allow for high probability results for specific algorithms. 

4.1 Task Systems 

The properties of online problems needed for Theorem 1 are related to the definition of task
systems. There are, however, some important differences. 

To analyze the relation, let us recall the definition of task systems as introduced by Borodin
et al. [5]. We are given a finite state space $S$ and a function $d\colon S \times S \to \mathbb{R}^+$ that specifies the
(finite) cost to move from one state to another. The requests given as input to a task system
are a sequence of $|S|$-vectors that specify, for each state, the cost to process the current task if
the system resides in that state. An online algorithm for task systems aims to find a schedule
such that the overall cost for transitions and processing is minimized. From now on we will call
states in $S$ system states to distinguish them from the states of Definition 8. The main difference
between states of Definition 8 and system states is that states and the distances between states
depend on the requests provided as input and on the answers given by the online algorithm; this
way there may be infinitely many states. States are also more general than system states in that
we may forbid specific state transitions.
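The cost model of a task system can be made concrete with a small sketch. The following Python code is an illustration under assumptions: states are numbered 0, ..., |S| - 1, `d` is a matrix of transition costs, each task is a list of processing costs, and the system first moves and then processes the task in the new state (this order is an assumption of the sketch). The offline optimum is computed by a simple dynamic program.

```python
def schedule_cost(d, tasks, start, schedule):
    """Total transition plus processing cost of a given schedule.

    schedule[i] is the system state in which task i is processed.
    """
    total, current = 0, start
    for task, nxt in zip(tasks, schedule):
        total += d[current][nxt] + task[nxt]
        current = nxt
    return total


def offline_optimum(d, tasks, start):
    """Cost of an optimal offline schedule via dynamic programming."""
    n = len(d)
    inf = float("inf")
    # best[v]: cheapest cost of serving the tasks seen so far and ending in v
    best = [0 if v == start else inf for v in range(n)]
    for task in tasks:
        best = [min(best[u] + d[u][v] for u in range(n)) + task[v]
                for v in range(n)]
    return min(best)
```

For two states at distance 1 and the tasks `[[5, 0], [0, 5], [5, 0]]`, always moving to the state with processing cost zero costs 3 in total, which is also the offline optimum.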

Theorem 2. Let $A$ be a randomized online algorithm with expected competitive ratio $r$ for task
systems. Then, for any $\varepsilon > 0$, there is a randomized online algorithm $A'$ for task systems with
competitive ratio $(1 + \varepsilon) r$ w.h.p. (with respect to the optimal cost).

Proof. In a task system, the system states are exactly the states according to our definition,
because the optimal future cost only depends on the current system state and a future request
has the freedom to assign individual costs to each of the system states. In other words, an
equivalence class $s$ from Definition 8 (i.e., one state) consists of exactly one unique system state.
To apply Theorem 1, we choose the constant $B$ of the theorem to be $\max_{s,t \in S}\{d(s, t)\}$. This
way, the problem is opt-bounded as one transition of cost at most $B$ is sufficient to move to any
system state used by an optimal computation. The problem is clearly partitionable according to
Definition 6 as each round is associated with a non-negative cost. The adversary may also stop
after an arbitrary request.

The remaining condition of Theorem 1 that every state is initial formally conflicts with the
definition of task systems, because usually there is a unique initial configuration that corresponds
to a state $s_0$. This problem is easy to circumvent by relabeling the states before each run (reset)
of the algorithm, i.e., we construct an algorithm $A''$ that is used instead of $A$. When starting the
computation, $A''$ determines the mapping and simulates the run of $A$ on the mapped instance.
Thus we are able to use Theorem 1 on $A''$ and the claim follows.

□ 

4.2 The k-Server Problem

The $k$-server problem, introduced by Manasse et al. [16], is concerned with the movement of $k$
servers in a metric space. Each request is a location and the algorithm has to move one of the 
servers to that location. If the metric space is finite, this problem is well known to be a special 
metrical task system. The states are all combinations of k locations in the metric space and 
the distance between two states is the corresponding minimum cost to move servers such that 
the new locations are reached. Each request is a vector where all states but those containing 
the correct destination have a processing time oo and the states containing the destination have 
processing time zero. Using Theorem 2, this directly implies that all algorithms with a constant
expected competitive ratio for the $k$-server problem in a finite metric space can be transformed
into algorithms that have almost the same competitive ratio w.h.p. 

If the metric space is infinite, an analogous result is still valid except that we have to bound 
the maximum transition cost by a constant. This is the case, because the proof of Theorem 2
uses the finiteness of the state space only to ensure bounded transition costs. 

Without the restriction to bounded distances, in general we cannot obtain a competitive 
ratio much better than the deterministic one w.h.p. 

Theorem 3. Let $(M, d)$ be a metric space with $|M| = n$ constant, $s \in M$ be the initial position of
all servers, $\ell$ a constant and let $r$ be the infimum over the competitive ratios of all deterministic
online algorithms for the $k$-server problem in $(M, d)$ for instances with at most $\ell$ requests. For
every $\varepsilon > 0$, there is a metric space $(M', d')$ where for any randomized online algorithm $R$ for the
$k$-server problem there is an oblivious adversary against which the solution of $R$ has a competitive
ratio of at least $r - \varepsilon$ with constant probability.

Proof. We obtain $(M', d')$ as follows. The set $M'$ is composed of copies of $M \setminus \{s\}$. Let, for each
$i \in \mathbb{N}$, $M_i$ denote the $i$th copy of $M$ in $M'$ together with the point $s$ (i.e., $s$ is in each of the
sets $M_i$). This way $M = M_1$. For any pair of points $u, v \in M$ with copies $u_i, v_i$ in $M_i$, we set
$d'(u_i, v_i) = i \cdot d(u, v)$; we call $i$ the scaling factor of $M_i$. For any $i \ne j$, the distance between
points in distinct copies of $M$ is $d'(u_i, v_j) = d'(s, u_i) + d'(s, v_j)$. This way $(M', d')$ is a metric and
we can choose freely a scaling factor for the cost function $d$.
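The construction of the scaled copies can be sketched in a few lines. The following Python code is an illustration under assumptions: it builds only finitely many copies (the parameter `copies`), represents a copied point as a pair `(u, i)`, assumes that the points of M are not themselves tuples, and routes distances between different copies through the shared point s.

```python
def scaled_copies(points, d, s, copies):
    """Return (nodes, d_prime) for the first `copies` scaled copies of M.

    points: the points of M; d(u, v): the metric on M; s: the shared point.
    A node is either s itself or a pair (u, i) meaning 'copy i of u'.
    """
    nodes = [s] + [(u, i) for i in range(1, copies + 1)
                   for u in points if u != s]

    def to_s(p):
        # distance from node p to the shared point s
        if p == s:
            return 0
        u, i = p
        return i * d(u, s)

    def d_prime(p, q):
        if p == q:
            return 0
        if p != s and q != s and p[1] == q[1]:
            return p[1] * d(p[0], q[0])   # same copy: scaled distance
        return to_s(p) + to_s(q)          # otherwise: route through s

    return nodes, d_prime
```

On a small example one can then check symmetry and the triangle inequality by brute force over all triples of nodes.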

We now describe an adversary Adv that uses oblivious adversaries for deterministic online 
algorithms as black boxes and has two parameters A and C that specify lower bounds on the 
number of requests and the cost of the optimal offline solution. Adv starts with A requests of 
the point s in M (i. e., the optimal cost after the first A requests is zero). Note that we cannot 
assume A to be a constant.*

Afterwards the adversary starts a second phase where it simulates a deterministic adversary in 
a suitably scaled copy of M. We assume without loss of generality that any considered algorithm 
is lazy, i.e., it answers requests by only moving at most one server (see Manasse et al. [16]). We
choose as scaling factor j = C " '^^'^u,v&M{d{u, v)}. Adv sends all subsequent requests in Mj. 

Due to the laziness assumption, after the first A requests there are at most $k^\ell$ different
possibilities to answer the subsequent $\ell$ requests (we can view an answer simply as the index of
one of the $k$ servers). Adding also all shorter request sequences, by the geometric series there are
at most $\sum_{i \le \ell} k^i < k^{\ell+1}$ possible answer sequences. Analogously, there are less than $n^{\ell+1}$ possible
request sequences of length at most $\ell$ in $M_j$. Thus, the total number of algorithms behaving
differently within at most $\ell$ requests is less than $\psi = (k^{\ell+1})^{(n^{\ell+1})}$ and therefore constant.
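The counting argument is easy to evaluate for small parameters. The Python sketch below is an illustration; the function names are hypothetical, and the geometric sum is taken over the lengths 0, ..., l, which is an assumption about the exact summation range.

```python
def geometric_bound_holds(k, ell):
    """Check that the number of answer sequences of length at most ell
    stays below k^(ell + 1); this needs k >= 2."""
    return sum(k ** i for i in range(ell + 1)) < k ** (ell + 1)


def psi(k, n, ell):
    """Bound on the number of distinct deterministic behaviours: one answer
    sequence (fewer than k^(ell+1) choices) per request sequence
    (fewer than n^(ell+1) of them)."""
    return (k ** (ell + 1)) ** (n ** (ell + 1))
```

Already `psi(2, 3, 2)` equals 2 to the power 81, so the bound is a constant only in the asymptotic sense used in the proof.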

Adv may choose one of at most $\psi$ deterministic algorithms to play against. He analyzes the
probability distribution of $R$'s strategies after the first A requests. Then he selects one of the
$\psi$ algorithms that corresponds to the strategy run by $R$ with maximal probability. With Adv's
choice of the algorithm, the competitive ratio of $R$ is at least $r - \varepsilon$ with constant probability at
least $\psi^{-1}$ and the choice of $j$ ensures that the optimal cost is at least C.

□ 

Corollary 3. If we allow the metric to be infinite, then there is no $(k - \varepsilon)$-competitive online
algorithm w.h.p. for the $k$-server problem for any constant $\varepsilon$.

We simply use that the lower bound of Manasse et al. [16] satisfies the properties of Theorem 3.



4.3 Paging 

In the paging problem there is a cache that can accommodate k memory pages and the input 
consists of a sequence of requests to memory pages. If the requested page is in the cache, it can 
be served immediately, otherwise some page must be evicted from the cache, and be replaced 
by the requested page; this process is called a page fault. The aim of a paging algorithm is to 
generate as few page faults as possible. Each request generates either cost 0 (no page fault)
or 1 (page fault), and the overall cost is the sum of the costs of the requests. Paging can be
seen as a $k$-server problem restricted to uniform metrics where all distances are exactly one. In
particular, the transition costs in that metric are bounded. Hence, the assumptions discussed 
in the previous subsection are fulfilled, meaning that for any paging algorithm with expected 
competitive ratio r there is an algorithm with competitive ratio r(l + e) w.h.p. 

Note that the marking algorithm is analyzed based on phases that correspond to $k + 1$ distinct
requests, and hence the analysis of the expected competitive ratio immediately gives the $2H_k - 1$
competitive ratio also w.h.p. However, e.g., the optimal algorithm with competitive ratio $H_k$
due to Achlioptas et al. [2] is a distribution-based algorithm where the high probability analysis
is not immediate; Theorem 1 gives an algorithm with competitive ratio $H_k(1 + \varepsilon)$ w.h.p. also in
this case.
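For concreteness, the randomized marking algorithm mentioned above can be simulated as follows. This Python sketch follows a common textbook formulation (mark every requested page, evict a uniformly random unmarked page on a fault, and unmark all pages when none is left to evict); the function name and the optional `rng` parameter are assumptions of this example.

```python
import random


def marking_faults(requests, k, rng=None):
    """Count the page faults of the randomized marking algorithm
    with a cache of size k on the given request sequence."""
    rng = rng or random.Random()
    cache, marked, faults = set(), set(), 0
    for page in requests:
        if page not in cache:
            faults += 1
            if len(cache) >= k:
                if not (cache - marked):
                    marked.clear()        # all pages marked: a new phase starts
                # evict a uniformly random unmarked page
                victim = rng.choice(sorted(cache - marked))
                cache.remove(victim)
            cache.add(page)
        marked.add(page)                  # every requested page gets marked
    return faults
```

Since every request costs 0 or 1, the returned number of faults is exactly the cost of the algorithm on that input.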



4.4 Job Shop Scheduling with Unit Length Jobs 

In Section 5 we will show that none of the conditions of Theorem 1 can be omitted. However,
there are problems that do not fit the assumptions of the theorem, and still can be solved almost 
optimally by specific randomized online algorithms with high probability. We use, however, a 
weaker notion of high probability than in the previous sections. 

* Without the first A requests, for a fixed online algorithm the only way to access more than constantly many
random bits within $\ell$ requests is to use the random bits to decide on further access to the random tape. But
then we could fix a constant probability to only access constantly many random bits. Thus, omitting A would
strengthen the adversary and weaken this lower bound result more than it is acceptable.
Figure 2: An example input with two jobs each of size 15 and two strategies. Obstacles are marked by filled cells.



Consider the problem job shop scheduling with unit length tasks (Jss for short) defined as
follows: We are given a constant number of $n$ jobs $J_1$ to $J_n$ that consist of $m$ tasks each. Each such
task needs to be processed on a unique one of $m$ machines which are identified by their indices
$1, 2, \dots, m$, and we want to find a schedule with the following properties. Processing one task
takes exactly 1 time unit and, since all jobs need every machine exactly once, we may represent
them as permutations $P_i = (p^i_1, p^i_2, \dots, p^i_m)$ of the machine indices, where $p^i_j \in \{1, 2, \dots, m\}$ for
every $i \in \{1, 2, \dots, n\}$ and $j \in \{1, 2, \dots, m\}$. All $P_i$ arrive in an online fashion, that is, the
$(k + 1)$th task of $P_i$ is not known before the $k$th task is processed. Obviously, as long as all jobs
request different machines, the work can be parallelized. If, however, at one time step, some of 
them ask for the same machine, all but one of them have to be delayed. The cost of a solution is 
given by the total time needed for all jobs to finish all tasks; the goal is to minimize this time 
(i.e., the overall makespan). 

In the following, we use a graphical representation that was introduced by Brucker [6]. Let
us first consider only two jobs $P_1$ and $P_2$. Consider an $(m \times m)$-grid where we label the $x$-axis
with $P_1$ and the $y$-axis with $P_2$. The cell $(i, j)$ models that, in the corresponding time step,
$P_1$ processes a task on machine $p^1_i$ while $P_2$ processes a task on $p^2_j$. A feasible schedule for the
induced instance of Jss is a path that starts at the upper-left vertex of the grid and leads to the
bottom right vertex.

It may use diagonal edges whenever $p^1_i \ne p^2_j$. However, if $p^1_i = p^2_j$, both $P_1$ and $P_2$ ask for the
same machine at the same time and therefore, one of them has to be delayed. In this case, we
say that $P_1$ and $P_2$ collide and call the corresponding cells in the grid obstacles (see Fig. 2 for
an example with $m = 15$). If an algorithm has to delay a job, we say that it hits an obstacle
and may therefore not make a diagonal move, but either a horizontal or a vertical one. In the
first case, $P_2$ gets delayed, in the second case, $P_1$ gets delayed. Note that, since $P_1$ and $P_2$ are
permutations, there is exactly one obstacle per row and exactly one obstacle per column for
every instance, therefore, m obstacles overall for any instance. The graphical representation 
generalizes naturally to the n-dimensional case. 
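For two jobs, the obstacles of the grid can be computed directly from the two permutations. The following Python sketch is an illustration; the function name and the 0-based cell coordinates are assumptions of this example.

```python
def obstacles(p1, p2):
    """Cells (i, j) of the grid in which both jobs request the same machine.

    p1 and p2 are permutations of the same machine indices; cell (i, j)
    means that job 1 is at its i-th task and job 2 at its j-th task
    (both counted from 0).
    """
    position_in_p2 = {machine: j for j, machine in enumerate(p2)}
    return [(i, position_in_p2[machine]) for i, machine in enumerate(p1)]
```

Because both inputs are permutations, the result contains exactly one cell per row and one per column, so m obstacles overall.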



The problem has been studied previously, for instance in u^lpl^ 10 , 12 . Hromkovic et al. 10 



showed the existence of a randomized online algorithm $R$ that achieves an expected competitive
ratio of $1 + 2n/\sqrt{m}$, for $n = o(\sqrt{m})$, assuming that it knows $m$. $R$ depends on diagonals in the
grid; intuitively (in two or three dimensions), a diagonal in the grid is the sequence of integer
points on a line that is parallel to the line from the coordinate $(0, 0, \dots, 0)$ to $(m, m, \dots, m)$.
More precisely, let $\mathcal{P}$ be the convex hull of the grid. Then a diagonal is a sequence of integer
points $d = (d^i)_i$ such that the first point $d^1$ is in the facet of $\mathcal{P}$ that contains the origin $(0, 0, \dots, 0)$, the last point is in the
facet containing the destination $(m, m, \dots, m)$, none of the two points is in a smaller-dimensional
face, and we obtain $d^{i+1}$ from $d^i$ by increasing each coordinate by exactly one. As shown by
Hromkovic et al. [10], the number of diagonals that start at points with all coordinates at most $r$
is exactly $n \cdot r^{n-1}$.

A diagonal template $D$ with respect to $r$ and $d$ is a sequence of consecutive points in the
grid that starts from $(0, 0, \dots, 0)$, moves to $d^1$, visits each point of $d$ and finally moves to the
destination $(m, m, \dots, m)$. To reach $d^1$, $D$ delays each job $P_i$ by $r - d^1_i$ time units in the beginning
and delays each job $P_i$ by $d^1_i$ time units upon reaching the last point of the diagonal. Thus,
a schedule that follows a diagonal template without delays has a length of exactly $m + r$. A
diagonal strategy with respect to a diagonal template $D$ is a minimum-length schedule that
visits each point of $D$. Note that an online algorithm has all necessary information to run a
diagonal strategy, because when reaching an obstacle, all possible ways to the subsequent point
are available; an example of a diagonal strategy is depicted in Fig. 2.

The randomized algorithm $R$ fixes the value $r$ and chooses uniformly at random a diagonal $d$
with $\|d^1\|_\infty \le r$; then it follows the corresponding diagonal strategy.



Theorem 4. For any $r = o(\sqrt{m})$ there is an online algorithm for Jss that is $(1 + f(m))$-competitive
with probability $1 - o_m(1)$, for any $f(m) = \omega(1/r)$.

Proof. We already mentioned that $R$ chooses one of $n \cdot r^{n-1}$ diagonals. It is also known that the
total number of delays in all diagonal strategies caused by obstacles is at most $m \cdot \binom{n}{2} \cdot (n - 1) \cdot r^{n-2}$
[10]. Clearly, any schedule has a length of at least $m$. Thus, in order to be $(1 + f(m))$-competitive,
we need a diagonal strategy such that $(m + r + d)/m \le 1 + f(m)$, where $d$ is the number of
delays due to obstacles. Let $b$ be the number of diagonals considered by the algorithm such that
the corresponding diagonal strategies have more than $m f(m) - r$ delays caused by obstacles.
Then, to show our claim, we have to ensure that $b/(n \cdot r^{n-1}) = o_m(1)$.

The value of $b$ is maximized if we assume that any diagonal has either no obstacles or the
delay is exactly $m f(m) - r$. Therefore,

$$b \le \frac{m \cdot \binom{n}{2} \cdot (n - 1) \cdot r^{n-2}}{m f(m) - r}.$$

Since the dimension $n$ is a constant, the claim follows from

$$\frac{m \cdot \binom{n}{2} \cdot (n - 1) \cdot r^{n-2}}{(m f(m) - r) \cdot n \cdot r^{n-1}} = \binom{n}{2} \cdot \frac{n - 1}{n} \cdot \frac{m}{(m f(m) - r) \cdot r},$$

which is $o_m(1)$ for $r = o(\sqrt{m})$ and $f(m) = \omega(1/r)$.

□ 
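The bound on the fraction of bad diagonals can be evaluated for concrete growth rates. The Python sketch below is an illustration; the function name and the sample choices r = m^(1/4) and f(m) = r^(-1/2) are assumptions of this example, picked only because they satisfy r = o(sqrt(m)) and f(m) = omega(1/r).

```python
import math


def bad_fraction_bound(m, n, r, f):
    """Upper bound on b / (n * r^(n-1)) as derived in the proof.

    m: number of machines, n: number of jobs, r: diagonal range,
    f: the value f(m); requires m * f > r.
    """
    total_delays = m * math.comb(n, 2) * (n - 1) * r ** (n - 2)
    b_bound = total_delays / (m * f - r)
    return b_bound / (n * r ** (n - 1))


def sample_bound(m, n):
    """Evaluate the bound for the sample choices r = m^(1/4), f = r^(-1/2)."""
    r = m ** 0.25
    return bad_fraction_bound(m, n, r, r ** -0.5)
```

For n = 3 the values of `sample_bound` shrink as m grows, in line with the claim that the fraction of bad diagonals vanishes.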



5 Necessity of Requirements 

As mentioned above, our result holds with large generality as many well-studied online problems 
meet the requirements we imposed. However, the assumptions of Theorem 1 require that the
problem at hand 

1. is partitionable, 

2. every state is equivalent to some initial state, and 

3. $\forall s, s', x\colon |\mathrm{Cost}(\mathrm{OPT}_s(x)) - \mathrm{Cost}(\mathrm{OPT}_{s'}(x))| \le B$.

As stated before, partitionability is not restrictive; every problem can be presented as a 
partitionable one. We now show that removing any of the conditions 2 and 3 allows for a
counterexample to the theorem. For the purpose of this discussion, let $s$ and $s'$ in condition 3
range over all initial states to have it defined also for non-symmetric problems.
First, let us consider the following online problem where condition 2 is violated, i.e., where
not every state is equivalent to some initial state. There are $n + 2$ requests $x_{-1}, x_0, \dots, x_n$. The
request $x_{-1}$ is a dummy request. The request $x_0 \in \{0, 1\}$ is a test: if $y_{-1} = x_0$ the test is
passed, otherwise the test is failed; the cost of $y_{-1}$ and $y_0$ is always zero. For the remaining
requests $x_1, \dots, x_n$ we have $x_i \in \{0, 1\}$. The cost of $y_i$, for $i = 1, \dots, n$, is 1 if the test has been
passed, or if $y_{i-1} = x_i$. Otherwise, the cost of $y_i$ is 5. The cost of $y_n$ is zero. The problem is
clearly partitionable. There are six states: the initial state, then two possible states to guess 
the test, then one state for processing all requests with the test passed, and two states for 
processing requests with the test failed, based on the value of the previous answer. From any 
state, however, the optimal value of the remaining sequence of m requests is between m and 
m + 6. A randomized online algorithm that guesses each time independently has probability 1/2 
to pass the test incurring a cost of n, and probability 1/2 to fail, in which case, for any subsequent 
request, it pays 1 with probability 1/2, and 5 with probability 1/2. Putting everything together, 
the expected cost is 2n, so r = 2. On the other hand, for any randomized algorithm, there is 
an input for which it has probability at least 1/2 of failing the test, and then on each request 
probability at least 1/2 of a wrong guess. From symmetry arguments we conclude that, once the 
test is failed, the probability that the algorithm makes at least n/2 — 1 wrong guesses is at least 
1/2. Hence, with probability at least 1/4 the cost of the algorithm is at least 3n — 4, so it cannot 
be c-competitive w.h.p. for any c < 3. 
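The two numbers used in this argument, the expected cost 2n and the cost 3n - 4 in the bad case, can be recomputed exactly. The following Python sketch is an illustration with hypothetical function names; it treats all n cost-relevant rounds alike, as the calculation in the text does.

```python
from fractions import Fraction


def expected_cost_of_guessing(n):
    """Expected cost of the algorithm that guesses every answer uniformly."""
    half = Fraction(1, 2)
    cost_if_passed = n                             # every round costs 1
    cost_if_failed = n * (half * 1 + half * 5)     # every round costs 1 or 5
    return half * cost_if_passed + half * cost_if_failed


def cost_after_failed_test(n, wrong_guesses):
    """Cost of n rounds after a failed test with the given number of wrong guesses."""
    return 5 * wrong_guesses + (n - wrong_guesses)
```

With n/2 - 1 wrong guesses the second function returns 3n - 4, while the first one returns 2n, which is where the gap between the expected ratio 2 and the ratio 3 with constant probability comes from.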

Next, let us remove condition 3. We have seen a hint to the necessity in Theorem 3, but
currently no randomized online algorithm for the $k$-server problem is known to have a competitive
ratio better than $2k - 1$ independent of the size of the metric space. Therefore we give a second
unconditional argument. Let us consider the following problem: the states are pairs $(s, t)$ where
$s \in \{0, 1\}$, $t \in \mathbb{N}$, and any state can be an initial one. Processing the request $x_i$ in state $(s, t)$
produces the answer $y_i \in \{0, 1\}$; the cost of $y_i$ is $2^t$ if $s = x_i$, and $3 \cdot 2^t$ if $s \ne x_i$. After processing
the request, the new state is $(y_i, t + 1)$. It is easy to verify that the problem is partitionable
and that the states are in accord with Definition 8. Also, it is easy to check that the worst-case
expected ratio of the algorithm that produces random answers is 2. On the other hand, consider
inputs that start from state $(0, 0)$ with $x_1 = 0$. The optimal cost is $2^n - 1$, however, any
randomized algorithm has probability at least 1/4 of incurring cost $9 \cdot 2^{n-2}$ (by failing the two
last requests).
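The costs in this second example grow geometrically, so the last rounds dominate. The Python sketch below recomputes the numbers; the function names and the convention that the n rounds carry the time stamps t = 0, ..., n - 1 are assumptions of this example.

```python
def optimal_cost(n):
    """Cost when every round is answered correctly: 2^0 + ... + 2^(n-1)."""
    return sum(2 ** t for t in range(n))


def cost_with_wrong_rounds(n, wrong_rounds):
    """Cost when exactly the rounds in `wrong_rounds` are answered wrongly."""
    return sum((3 if t in wrong_rounds else 1) * 2 ** t for t in range(n))
```

Failing only the last two rounds already costs at least 9 * 2^(n-2), which is more than twice the optimum 2^n - 1.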

6 Conclusion 

Our result opens several new questions. For instance, our results, so far, are only shown for 
minimization problems. Also note that our analysis does not hold for the notion of strict 
competitiveness (i.e., $\alpha = 0$) for arbitrary input sizes. Furthermore, the assumption that all input
strings are feasible for all states (implied by the opt-boundedness) may allow for relaxations. 

Until now, we only focused on upper bounds on the competitive ratio. Our results, however, 
also open a potential lower bound technique: if a problem satisfies our requirements, a lower 
bound w.h.p. implies a lower bound of almost the same quality in expectation. In this context 
it is natural to ask for the requirements of problems for a complementary result. How can we 
determine the class of problems such that each algorithm that is r-competitive w.h.p. can be 
transformed into an algorithm that is almost r-competitive in expectation? 

Finally, we would like to suggest the terminology to call a randomized online algorithm $A$
totally $r$-competitive if, for any positive constant $\varepsilon$, $A$ is $r$-competitive in expectation and we may
use Theorem 1 to construct an online algorithm that is $(r + \varepsilon)$-competitive w.h.p. Analogously,
an online problem is totally $r$-competitive if it admits a totally $r$-competitive algorithm.